# 4-bit quantization
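All of the models listed below store their weights as 4-bit integer codes plus per-group metadata. As a rough intuition for what that means, here is a minimal pure-Python sketch of group-wise 4-bit affine quantization; the MLX, AWQ, DWQ, and bitsandbytes NF4 schemes named in the listing all refine this basic idea (real implementations pack two 4-bit codes per byte, operate on tensors, and NF4 uses non-uniform levels, so treat this only as an illustration):

```python
# Illustrative sketch of group-wise 4-bit affine quantization (assumption:
# uniform levels with a per-group scale and minimum; not any library's API).

def quantize_group(weights, levels=16):
    """Map a group of floats onto 4-bit codes in [0, 15] plus (scale, min)."""
    w_min, w_max = min(weights), max(weights)
    scale = (w_max - w_min) / (levels - 1) or 1.0  # guard constant groups
    codes = [round((w - w_min) / scale) for w in weights]
    return codes, scale, w_min

def dequantize_group(codes, scale, w_min):
    """Reconstruct approximate floats from the 4-bit codes."""
    return [c * scale + w_min for c in codes]

if __name__ == "__main__":
    group = [-0.31, 0.07, 0.52, -0.88, 0.19, 0.44, -0.05, 0.73]
    codes, scale, zero = quantize_group(group)
    recon = dequantize_group(codes, scale, zero)
    max_err = max(abs(a - b) for a, b in zip(group, recon))
    # Rounding error is bounded by half a quantization step.
    assert max_err <= scale / 2 + 1e-9
    print(f"codes={codes}, max reconstruction error={max_err:.4f}")
```

Each group keeps only one scale and one minimum in full precision, which is why smaller group sizes trade a little extra memory for lower reconstruction error.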
A listing of community-published 4-bit quantized models, with publisher, license, tags, and download/like counts:

| Model | Publisher | License | Tags | Downloads | Likes | Description |
|---|---|---|---|---|---|---|
| DiffuCoder-7B-cpGRPO-4bit | mlx-community | | Large Language Model, Other | 218 | 1 | 4-bit quantized version of Apple's DiffuCoder-7B-cpGRPO, converted for the MLX framework. |
| Hunyuan-A13B-Instruct-4bit | mlx-community | Other | Large Language Model | 201 | 4 | 4-bit quantized version of Tencent's Hunyuan-A13B model, suited to instruction-following tasks. |
| Llama3.1-Turkish-ChatBot | metehanayhan | MIT | Large Language Model, Other | 176 | 2 | Turkish educational Q&A chatbot fine-tuned from Meta's LLaMA 3.1 8B, optimized for Turkish-language teaching scenarios. |
| Qwen3-235B-A22B-4bit-DWQ-053125 | mlx-community | Apache-2.0 | Large Language Model | 200 | 1 | 4-bit quantized version converted from Qwen3-235B-A22B-8bit for the MLX framework, suited to text generation. |
| Qwen3-30B-A3B-abliterated-fp4 | huihui-ai | Apache-2.0 | Large Language Model, Transformers | 103 | 1 | 4-bit quantized version of Qwen3-30B-A3B-abliterated, with an effective size comparable to an 8B model, suited to text generation. |
| DeepSeek-R1-0528-Qwen3-8B-4bit | mlx-community | MIT | Large Language Model | 924 | 1 | 4-bit quantized version converted from DeepSeek-R1-0528-Qwen3-8B for the MLX framework, suited to text generation. |
| DeepSeek-R1-0528-Qwen3-8B-MLX-4bit | lmstudio-community | MIT | Large Language Model | 274.40k | 1 | Large language model developed by DeepSeek AI, with 4-bit quantization for Apple silicon devices. |
| DeepSeek-R1-0528-4bit | mlx-community | | Large Language Model | 157 | 9 | 4-bit quantized model converted from DeepSeek-R1-0528 for the MLX framework. |
| Devstral-Small-2505-4bit-DWQ | mlx-community | Apache-2.0 | Large Language Model, Multilingual | 238 | 3 | 4-bit quantized language model in MLX format, suited to text generation. |
| MedGemma-4B-IT-4bit | mlx-community | Other | Image-to-Text, Transformers | 196 | 1 | Vision-language model for the medical field, handling images and text, suited to tasks such as medical image analysis. |
| Qwen3-235B-A22B-4bit-DWQ | mlx-community | Apache-2.0 | Large Language Model | 70 | 1 | 4-bit quantized version converted from Qwen3-235B-A22B-8bit, suited to text generation. |
| Qwen3-4B-4bit-DWQ | mlx-community | Apache-2.0 | Large Language Model | 517 | 2 | 4-bit DWQ quantized version of Qwen3-4B, converted to MLX format for text generation with the mlx library. |
| Qwen3-30B-A3B-4bit-DWQ-05082025 | mlx-community | Apache-2.0 | Large Language Model | 240 | 5 | 4-bit quantized model converted from Qwen/Qwen3-30B-A3B to MLX format, suited to text generation. |
| Qwen3-30B-A3B-4bit-DWQ-0508 | mlx-community | Apache-2.0 | Large Language Model | 410 | 12 | 4-bit quantized model converted from Qwen/Qwen3-30B-A3B to MLX format, suited to text generation. |
| Qwen3-30B-A3B-MNN | taobao-mnn | Apache-2.0 | Large Language Model, English | 550 | 1 | MNN model exported from Qwen3-30B-A3B, with 4-bit quantization for efficient inference. |
| Qwen3-8B-4bit-DWQ | mlx-community | Apache-2.0 | Large Language Model | 306 | 1 | 4-bit quantized version of Qwen/Qwen3-8B converted to MLX format, optimized for efficient operation on Apple devices. |
| Phi-4-mini-reasoning-MLX-4bit | lmstudio-community | MIT | Large Language Model | 72.19k | 2 | 4-bit quantized MLX-format version converted from Microsoft's Phi-4-mini-reasoning, suited to text generation. |
| Josiefied-Qwen3-1.7B-abliterated-v1-4bit | mlx-community | | Large Language Model | 135 | 2 | Lightweight 4-bit quantized model based on Qwen3-1.7B, optimized for the MLX framework. |
| Qwen3-8B-4bit-AWQ | mlx-community | Apache-2.0 | Large Language Model | 1,682 | 1 | 4-bit AWQ quantized version converted from Qwen/Qwen3-8B, suited to text generation under the MLX framework. |
| Qwen3-235B-A22B-4bit | mlx-community | Apache-2.0 | Large Language Model | 974 | 6 | 4-bit quantized version of Qwen/Qwen3-235B-A22B converted to MLX format, suited to text generation. |
| Qwen3-8B-4bit | mlx-community | Apache-2.0 | Large Language Model | 2,131 | 2 | 4-bit quantized version of Qwen/Qwen3-8B, converted to MLX format for efficient inference on Apple silicon. |
| Qwen3-30B-A3B-4bit | mlx-community | Apache-2.0 | Large Language Model | 2,394 | 7 | 4-bit quantized version converted from Qwen/Qwen3-30B-A3B, suited to efficient text generation under the MLX framework. |
| Qwen3-4B-4bit | mlx-community | Apache-2.0 | Large Language Model | 7,400 | 6 | 4-bit quantized version converted from Qwen/Qwen3-4B to MLX format, designed for efficient operation on Apple silicon. |
| Qwen3-1.7B-4bit | mlx-community | Apache-2.0 | Large Language Model | 11.85k | 2 | 4-bit quantized version of Qwen3-1.7B, converted to MLX format for efficient operation on Apple silicon. |
| Qwen3-14B-MLX-4bit | lmstudio-community | Apache-2.0 | Large Language Model | 3,178 | 4 | 4-bit quantized version of Qwen/Qwen3-14B converted with mlx-lm, suited to text generation. |
| Qwen3-4B-MNN | taobao-mnn | Apache-2.0 | Large Language Model, English | 10.60k | 2 | 4-bit quantized MNN version of Qwen3-4B for efficient text generation. |
| InternVL2_5-1B-MNN | taobao-mnn | Apache-2.0 | Large Language Model, English | 2,718 | 1 | 4-bit quantized version based on InternVL2_5-1B, suited to text generation and chat. |
| GLM-Z1-32B-0414-4bit | mlx-community | MIT | Large Language Model, Multilingual | 225 | 2 | 4-bit quantized version converted from THUDM/GLM-Z1-32B-0414, suited to text generation. |
| Dia-1.6B-4bit | mlx-community | Apache-2.0 | Speech Synthesis, English | 168 | 4 | 4-bit quantized MLX-format text-to-speech model converted from nari-labs/Dia-1.6B. |
| HiDream-I1-Full-nf4 | azaneko | MIT | Image Generation | 16.95k | 38 | Open-source 17-billion-parameter image generation foundation model producing high-quality images in seconds. |
| HiDream-I1-Fast-nf4 | azaneko | MIT | Image Generation | 19.22k | 7 | 4-bit quantized version of the 17B HiDream-I1 foundation model; runs in 16 GB of VRAM for fast, high-quality image generation. |
| HiDream-I1-Dev-nf4 | azaneko | MIT | Image Generation | 23.29k | 12 | Open-source 17-billion-parameter image generation foundation model producing high-quality images in seconds. |
| Zhaav-Gemma3-4B | alifzl | | Large Language Model, Other | 40 | 1 | Persian-specific model fine-tuned on the Gemma 3 architecture with QLoRA 4-bit quantization; runs on ordinary hardware. |
| QwQ-32B-NF4 | ginipick | Apache-2.0 | Large Language Model, Transformers, English | 150 | 27 | 4-bit quantized version of Qwen/QwQ-32B built with the bitsandbytes library, suited to text generation in resource-constrained environments. |
| QwQ-32B-bnb-4bit | fantos | Apache-2.0 | Large Language Model, Transformers, English | 115 | 4 | 4-bit quantized version of Qwen/QwQ-32B implemented with the bitsandbytes library, suited to text generation in resource-constrained environments. |
| gemma-3-4b-persian-v0-GGUF | mradermacher | Apache-2.0 | Large Language Model, Transformers, Other | 162 | 2 | Statically quantized version of mshojaei77/gemma-3-4b-persian-v0, optimized for Persian text generation. |
| gemma-3-27b-it-quantized-W4A16 | abhishekchohan | | Large Language Model, Transformers | 640 | 4 | W4A16 quantized version of Google's instruction-tuned Gemma 3 27B, substantially reducing hardware requirements. |
| gemma-3-4b-persian-v0 | mshojaei77 | Apache-2.0 | Large Language Model, Other | 542 | 9 | Persian-specific model based on the Gemma 3 architecture, quantized to 4 bits with QLoRA, focused on Persian text generation and understanding. |
| OLMo-2-0325-32B-Instruct-4bit | mlx-community | Apache-2.0 | Large Language Model, Transformers, English | 270 | 10 | 4-bit quantized version converted from allenai/OLMo-2-0325-32B-Instruct for the MLX framework, suited to text generation. |
| QwQ-32B-bnb-4bit | onekq-ai | Apache-2.0 | Large Language Model, Transformers | 167 | 2 | 4-bit quantized version of QwQ-32B built with bitsandbytes, suited to efficient inference in resource-constrained environments. |
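A rough rule of thumb behind claims like "runs in 16 GB of VRAM" and "reduced hardware requirements" above: quantized weight storage is roughly parameters × bits ÷ 8, plus a small overhead for per-group scales and zero points. The sketch below uses assumed values (group size 32, fp16 metadata) purely for illustration:

```python
def weight_memory_gb(n_params, bits, group_size=32, scale_bytes=2):
    """Approximate weight storage for a group-quantized model, in GiB.

    Assumption for illustration: one scale and one zero point
    (scale_bytes each) per group of group_size weights.
    """
    data = n_params * bits / 8
    metadata = (n_params / group_size) * 2 * scale_bytes
    return (data + metadata) / 1024**3

if __name__ == "__main__":
    # HiDream-I1 has 17B parameters: compare fp16 with 4-bit storage.
    fp16 = 17e9 * 2 / 1024**3  # 2 bytes per weight, no quantization metadata
    q4 = weight_memory_gb(17e9, 4)
    print(f"fp16 ~ {fp16:.1f} GiB, 4-bit ~ {q4:.1f} GiB")
```

For the 17B HiDream-I1 this works out to roughly 10 GiB of weights at 4 bits versus about 32 GiB at fp16, consistent with the 16 GB VRAM figure quoted for the NF4 releases (activations and framework overhead take the rest).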